139 research outputs found

    Mobile Data Management

    The management of data in the mobile computing environment poses challenging new problems. Existing software needs to be upgraded to accommodate this environment. To do so, the critical parameters need to be understood and defined. We have surveyed some problems and existing solutions.

    Controlling Web Query Execution in a Web Warehouse

    Most contemporary Web query systems have limited capabilities for controlling Web query execution. Such a facility is important because it gives us an opportunity to optimize the evaluation of a Web query. We address this issue in the context of our Web warehousing system called WHOWEDA (Warehouse Of Web Data). Specifically, we investigate different types of constraints (related to query execution) which may be imposed on a Web query, such as the number of query results, the time of execution, and the restriction of query evaluation to a specified set of Web sites. An important feature of our approach is that it attempts to address the query evaluation issues which may arise due to the existence of broken links and forms in the Web.

    ARENA: Towards Informative Alternative Query Plan Selection for Database Education

    A key learning goal for learners taking a database systems course is to understand how SQL queries are processed in an RDBMS in practice. To this end, comprehension of the cost-based comparison of different plan choices to select the query execution plan (QEP) of a query is paramount. Unfortunately, off-the-shelf RDBMSs typically expose only the selected QEP to users, without revealing information about representative alternative query plans considered during QEP selection in a learner-friendly manner, which hinders the learning process. In this paper, we present a novel end-to-end and generic framework called ARENA that facilitates exploration of informative alternative query plans of a given SQL query to aid the comprehension of QEP selection. Under the hood, ARENA addresses a novel problem called alternative plan selection problem (TIPS), which aims to discover a set of k alternative plans from the underlying plan space so that the plan interestingness of the set is maximized. Specifically, we explore two variants of the problem, namely batch TIPS and incremental TIPS, to cater to a diverse set of learners. Due to the computational hardness of the problem, we present a 2-approximation algorithm to address it efficiently. An exhaustive experimental study with real-world learners demonstrates the effectiveness of ARENA in enhancing learners' understanding of the alternative plan choices considered during QEP selection.

    Association Rules for Web Data Mining in WHOWEDA

    The authors discuss association rules which can be discovered from Web data. The association rules are discussed within the scope of our WHOWEDA (warehouse of Web data) project. WHOWEDA is supported by a Web data model and a set of algebraic operators. The Web data model allows a uniform and integrated view of Web data gathered using a user's query graph. A user's query graph describes the query by example (what the user perceives as the query), and the Web coupling query gathers instances of such a query graph from the Web and stores them in the form of subgraphs (called Web tuples) in a Web table. We discuss association rules within this domain. An association rule defines an association between the node and link attributes of Web tuples within a Web table. There are two different classes of association rules that can be developed from data in a Web table. Node-to-node associations are those rules that relate the content (defined by metadata attributes) between two or more nodes within a Web tuple. Link associations are rules that show the connectivity of different URLs. Distinguishing the two types of associations provides a view of the structure of the Web data. The goal of performing Web association mining on Web data is to better organize searching patterns through hyperlinked documents.
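As a rough illustration of what a node-to-node association rule measures, the sketch below computes the standard support and confidence of a rule over a small collection of Web tuples. Support and confidence are the usual association-rule measures and are not taken from the abstract itself. The attribute strings, the `web_table` data, and the function names are all hypothetical.

```python
def support(tuples, items):
    """Fraction of tuples whose attribute set contains every item."""
    items = set(items)
    return sum(items <= t for t in tuples) / len(tuples)


def confidence(tuples, antecedent, consequent):
    """Support of antecedent and consequent together, relative to the antecedent."""
    base = support(tuples, antecedent)
    if base == 0:
        return 0.0  # rule never applies, so report zero confidence
    return support(tuples, set(antecedent) | set(consequent)) / base


# Hypothetical Web table: each Web tuple is reduced to a set of
# metadata attributes drawn from its nodes (illustrative values only).
web_table = [
    {"topic=db", "type=pdf"},
    {"topic=db", "type=html"},
    {"topic=db", "type=pdf"},
    {"topic=ai", "type=pdf"},
]

rule_support = support(web_table, {"topic=db", "type=pdf"})
rule_confidence = confidence(web_table, {"topic=db"}, {"type=pdf"})
```

Here the rule "topic=db implies type=pdf" holds in two of the three tuples that mention `topic=db`.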

    DKWS: A Distributed System for Keyword Search on Massive Graphs (Complete Version)

    Due to the unstructuredness and the lack of schemas of graphs, such as knowledge graphs, social networks, and RDF graphs, keyword search for querying such graphs has been proposed. As graphs have become voluminous, large-scale distributed processing has attracted much interest from the database research community. While there have been several distributed systems, distributed querying techniques for keyword search are still limited. This paper proposes a novel distributed keyword search system called DKWS. First, we present a monotonic property with keyword search algorithms that guarantees correct parallelization. Second, we present a keyword search algorithm as monotonic backward and forward search phases. Moreover, we propose new tight bounds for pruning nodes being searched. Third, we propose a notify-push paradigm and PINE programming model of DKWS. The notify-push paradigm allows asynchronously exchanging the upper bounds of matches across the workers and the coordinator in DKWS. The PINE programming model naturally fits keyword search algorithms, as they have distinguished phases, to allow preemptive searches to mitigate staleness in a distributed system. Finally, we investigate the performance and effectiveness of DKWS through experiments using real-world datasets. We find that DKWS is up to two orders of magnitude faster than related techniques, and its communication costs are 7.6 times smaller than those of other techniques.

    Influence Maximization in Social Networks: A Survey

    Online social networks have become an important platform for people to communicate, share knowledge and disseminate information. Given the widespread usage of social media, individuals' ideas, preferences and behavior are often influenced by their peers or friends in the social networks that they participate in. Over the last decade, the influence maximization (IM) problem has been extensively adopted to model the diffusion of innovations and ideas. The purpose of IM is to select a set of k seed nodes that can influence the most individuals in the network. In this survey, we present a systematic study of the research and future directions with respect to the IM problem. We review the information diffusion models and analyze a variety of algorithms for the classic IM problem. We propose a taxonomy for potential readers to understand the key techniques and challenges. We also organize the milestone works in chronological order so that readers of this survey can follow the research roadmap in this field. Moreover, we categorize other application-oriented IM studies and study each of them in turn. Finally, we list a series of open questions as future directions for IM-related research, from which a reader of this survey can easily see what should be done next in this field.
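To make the seed-selection objective concrete, here is a minimal sketch of greedy selection of k seed nodes. It assumes a deliberately simplified diffusion model in which every edge always activates, so the spread of a seed set is just the set of nodes reachable from it. The survey itself covers probabilistic diffusion models, which this sketch does not implement. The graph and the function names are hypothetical.

```python
from collections import deque


def spread(graph, seeds):
    """Nodes reached from the seeds, assuming every edge always activates."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen


def greedy_seeds(graph, k):
    """Pick up to k seeds, each time adding the node that yields the largest spread."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    seeds = []
    for _ in range(min(k, len(nodes))):
        # Sorting the candidates makes tie-breaking deterministic.
        candidates = sorted(nodes - set(seeds))
        best = max(candidates, key=lambda n: len(spread(graph, seeds + [n])))
        seeds.append(best)
    return seeds


# Hypothetical directed network: an edge u -> v means u can influence v.
network = {"a": ["b", "c"], "b": ["d"], "c": [], "d": [], "e": ["f"], "f": []}
chosen = greedy_seeds(network, 2)
reached = spread(network, chosen)
```

With this toy network the first pick covers the component rooted at `a`, and the second pick is the node that adds the most new nodes to that coverage.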

    Cost-benefit Analysis of Web Bag in a Web Warehouse

    Sets and bags are closely related structures and have been studied in relational databases. A bag is different from a set in that it is sensitive to the number of times an element occurs, while a set is not. In this paper, we introduce the concept of a Web bag in the context of a World Wide Web warehouse called WHOWEDA (WareHouse Of WEb DAta), which we are currently building. Informally, a Web bag is a Web table which allows multiple occurrences of identical Web types. A Web bag helps one to discover useful knowledge from a Web table, such as visible documents or Web sites (i.e. documents/sites which can be reached by many paths), luminous documents (i.e. documents with many outgoing links) and luminous paths (i.e. frequently traversed paths). In this paper, we provide a cost-benefit analysis of materializing Web bags as compared to Web tables with distinct Web tuples.
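The set-versus-bag distinction can be sketched in a few lines. The example below assumes, purely for illustration, that duplicates arise when each Web tuple (modelled here as a path of URLs) is projected onto its final document. The data and variable names are hypothetical and do not come from the paper.

```python
from collections import Counter

# Hypothetical Web tuples, each modelled as a path of documents.
web_tuples = [
    ("a.html", "b.html", "d.html"),
    ("a.html", "c.html", "d.html"),
    ("a.html", "c.html", "e.html"),
]

# Project every tuple onto the document its path ends at.
projected = [path[-1] for path in web_tuples]

as_set = set(projected)      # a set ignores how often a document occurs
as_bag = Counter(projected)  # a bag keeps the number of occurrences

# The document reached by the most paths, i.e. the most "visible" one
# in the sense described above.
most_visible, path_count = as_bag.most_common(1)[0]
```

The set only records that `d.html` and `e.html` are reachable, while the bag also shows that `d.html` is reached by two paths.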

    Reducing Cognitive Overheads in a Web Warehouse using Reverse-Osmosis

    This paper provides a quantitative analysis of reducing cognitive overheads in a Web warehouse using an important class of operation called reverse-osmosis. The analysis is used to examine two different cognitive overheads: locating relevant nodes or information, and the display time of a Web table. A reverse-osmosis operation enables us to eliminate irrelevant information from a collection of Web documents stored in the form of a Web table. We call such an operation reverse-osmosis because it is analogous to the reverse osmosis process in the field of water purification. We discuss a formal algorithm of the reverse-osmosis operation.